Ranked Join Indices
Abstract
Many data sources contain entities that can be ordered by a variety of attributes associated with them. Each such ordering effectively ranks the entities by their values in the attribute domain. Users commonly correlate such sources for query processing through join operations. In query processing, it is desirable to incorporate user preferences towards specific attributes or their values. One way to incorporate such preferences is to use scoring functions that combine user preferences with attribute values and return a numerical score for each tuple in the join result. A target query, which we refer to as a top-k join query, then seeks to identify the tuples in the join result with the highest scores. In this paper, we propose a novel technique, which we refer to as the ranked join index, to efficiently answer top-k join queries for arbitrary, user-specified preferences and a large class of scoring functions. Our ranked join index requires little space (compared to the entire join result) and provides guarantees on its performance. Moreover, our proposal provides a graceful tradeoff between its space requirements and worst-case search performance. We supplement our analytical results with a thorough experimental evaluation on a variety of real and synthetic data sets, demonstrating that, in comparison to other viable approaches, our technique offers significant performance benefits.
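To make the notion of a top-k join query concrete, the following is a minimal sketch of a naive baseline that materializes and scores the whole join result. It is not the ranked join index proposed in the paper. The function name `topk_join`, the two-relation layout, and the linear scoring function `w1 * a + w2 * b` are assumptions made only for illustration; the abstract does not specify the exact class of scoring functions.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

# Hypothetical row layout: (join_key, rank_attribute)
Row = Tuple[int, float]


def topk_join(
    r: Sequence[Row],
    s: Sequence[Row],
    weights: Tuple[float, float],
    k: int,
) -> List[Tuple[float, int, float, float]]:
    """Naive top-k join: score every joined pair and keep the k best.

    Assumes a linear scoring function w1 * a + w2 * b, where the weights
    stand in for the user's preferences. Returns (score, key, a, b) tuples
    sorted by descending score.
    """
    w1, w2 = weights

    # Hash the second relation on the join key.
    s_by_key: Dict[int, List[float]] = {}
    for key, b in s:
        s_by_key.setdefault(key, []).append(b)

    # Lazily enumerate the scored join result.
    scored = (
        (w1 * a + w2 * b, key, a, b)
        for key, a in r
        for b in s_by_key.get(key, [])
    )

    # Keep only the k highest-scoring joined tuples.
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    r = [(1, 0.9), (2, 0.4), (3, 0.7)]
    s = [(1, 0.2), (2, 0.8), (3, 0.6), (3, 0.1)]
    for row in topk_join(r, s, weights=(0.5, 0.5), k=2):
        print(row)
```

This baseline touches every tuple of the join result for each query and each choice of weights, which is the cost an index over the join result aims to avoid.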
Similar resources
Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs
In many applications, the underlying data (the web, an XML document, or a relational database) can be seen as a graph. These graphs may be enriched with weights, associated with the nodes and edges of the graph, denoting application specific desirability/penalty assessments, such as popularity, trust, or cost. A particular challenge when considering such weights in query processing is that resu...
Distance-Associated Join Indices for Spatial Range Search
Spatial join indices are join indices constructed for spatial objects. Similar to join indices for relational database systems, spatial join indices improve the efficiency of spatial join operations. In this paper, a distance-associated join index structure is developed to speed up spatial queries, especially spatial range queries. Three distance-associated join indexing mechanisms: basic, ring-s...
Accelerating Spatial Join Operations using Bit-Indices
Spatial join is a very expensive operation in spatial databases. In this paper, we propose an innovative method for accelerating spatial join operations using Spatial Join Bitmap (SJB) indices. The SJB indices are used to keep track of intersecting entities in the joining data sets. We provide algorithms for constructing SJB indices and for maintaining the SJB indices when the data sets are upd...
A Data Mining Approach for selecting Bitmap Join Indices
Index selection is one of the most important decisions to make in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but incur storage cost and maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash, etc.) and multi-attribute indices (join indices, bitmap...
An Efficient Multi Join Algorithm Utilizing a Lattice of Double Indices
In this paper, a novel multi-join algorithm for joining multiple relations is introduced. The algorithm is based on a hash-based join algorithm for two relations that produces a double index. This is done by scanning the two relations once. But instead of moving the records into buckets, a double index is built. This will eliminate the collision that can happen from a complete hash al...